HISTORICAL GRAPH DATA MANAGEMENT, Udayan Khurana, Doctor of Philosophy, 2015

Authors

  • Udayan Khurana
  • Amol Deshpande
Abstract

Title of dissertation: HISTORICAL GRAPH DATA MANAGEMENT

Udayan Khurana, Doctor of Philosophy, 2015

Dissertation directed by: Professor Amol Deshpande, Department of Computer Science

Over the last decade, we have witnessed an increasing interest in temporal analysis of information networks such as social networks or citation networks. Finding temporal interaction patterns, visualizing the evolution of graph properties, or even simply comparing them across time, has proven to add significant value in reasoning over networks. However, because of the lack of underlying data management support, much of the work on large-scale graph analytics to date has focused on the study of static properties of graph snapshots. Unfortunately, a static view of interactions between entities is often an oversimplification of several complex phenomena like the spread of epidemics, information diffusion, formation of online communities, and so on. In the absence of appropriate support, an analyst today has to manually navigate the added temporal complexity of large evolving graphs, making the process cumbersome and ineffective.

In this dissertation, I address the key challenges in storing, retrieving, and analyzing large historical graphs. In the first part, I present DeltaGraph, a novel, extensible, highly tunable, and distributed hierarchical index structure that enables compact recording of the historical information, and that supports efficient retrieval of historical graph snapshots. I present analytical models for estimating required storage space and snapshot retrieval times, which aid in choosing the right parameters for a specific scenario. I also present optimizations such as partial materialization and columnar storage to speed up snapshot retrieval.

In the second part, I present Temporal Graph Index, which builds upon DeltaGraph to support version-centric retrieval, such as a node's 1-hop neighborhood history, along with snapshot reconstruction. It provides high scalability, employing careful partitioning, distribution, and replication strategies that effectively deal with temporal and topological skew, typical of temporal graph datasets.

In the last part of the dissertation, I present Temporal Graph Analysis Framework, which enables analysts to effectively express a variety of complex historical graph analysis tasks using a set of novel temporal graph operators and to execute them in an efficient and scalable manner on a cloud. My proposed solutions are engineered in the form of a framework called the Historical Graph Store, designed to facilitate a wide variety of large-scale historical graph analysis tasks.

Similar Resources

An Introduction to Temporal Graph Data Management

This paper presents an introduction to the problem of temporal graph data management in the form of a survey of relevant techniques from database management and graph processing. Social network analytics, which focuses on finding interesting facts over static graphs, has gathered much attention lately. However, there hasn’t been much work on analysis of temporal or evolving graphs. We believe t...

GraphGen: Exploring Interesting Graphs in Relational Data

Analyzing interconnection structures among the data through the use of graph algorithms and graph analytics has been shown to provide tremendous value in many application domains. However, graphs are not the primary choice for how most data is currently stored, and users who want to employ graph analytics are forced to extract data from their data stores, construct the requisite graphs, and the...

Storing and Analyzing Historical Graph Data at Scale

The work on large-scale graph analytics to date has largely focused on the study of static properties of graph snapshots. However, a static view of interactions between entities is often an oversimplification of several complex phenomena like the spread of epidemics, information diffusion, formation of online communities, and so on. Being able to find temporal interaction patterns, visualize th...

Graph-based Exploration of Non-graph Datasets

Graphs or networks provide a powerful abstraction to view and analyze relationships among different entities present in a dataset. However, much of the data of interest to analysts and data scientists resides in non-graph forms such as relational databases, JSON, XML, CSV and text. The effort and skill required in identifying and extracting the relevant graph representation from data is often t...

NetEvViz: Extending NodeXL for Dynamic Network Visualization

We present NetEvViz, a visualization tool to help analysis of dynamic and static network data. While there are a few tools for analyzing static networks, where edge or node addition or deletion does not feature, there are none publicly available that deal with dynamic networks that change with time. Our tool is a Microsoft Excel plug-in and extends the codebase of NodeXL, a popular network visu...

Publication date: 2015